Regression discontinuity designs with multiple rating-score variables
Abstract
In the absence of a randomized control trial, regression discontinuity (RD) designs can produce plausible estimates of the treatment effect on an outcome for individuals near a cutoff score. In the standard RD design, individuals with rating scores higher than some exogenously determined cutoff score are assigned to one treatment condition; those with rating scores below the cutoff score are assigned to an alternate treatment condition. Many education policies, however, assign treatment status on the basis of more than one rating-score dimension. We refer to this class of RD designs as “multiple rating score regression discontinuity” (MRSRD) designs. In this paper, we discuss five different approaches to estimating treatment effects using MRSRD designs (response surface RD; frontier RD; fuzzy frontier RD; distance-based RD; and binding-score RD). We discuss differences among them in terms of their estimands, applications, statistical power, and potential extensions for studying heterogeneity of treatment effects.

Introduction

Regression discontinuity (RD) designs for inferring causality in the absence of a randomized experiment have a long history in the social sciences (see Cook, 2008) and have become increasingly popular in recent years (e.g., Cook, Shadish, & Wong, 2008; Jacob & Lefgren, 2004; Journal of Econometrics, 2008; Ludwig & Miller, 2007). Because the mechanism for selection into the treatment/control condition is known and observable in a regression discontinuity design, RD can provide unbiased estimates of treatment effects under much weaker assumptions than required for other quasi-experimental methods (Cook et al., 2008).
Traditional RD uses a discontinuity in the receipt of treatment along a continuous measure (referred to as the rating score, running variable, or forcing variable), and estimates the treatment effect as the difference in the estimated limits of the average observed outcomes on either side of the discontinuity. Some recent examples of the application of RD designs in educational research include studies of Reading First (Gamse, Bloom, Kemple, & Jacob, 2008), Head Start (Ludwig & Miller, 2007), public college admission policies (Niu & Tienda, 2009; Kane, 2003), and remedial education (Jacob & Lefgren, 2004; Matsudaira, 2008).

However, many education policies rely on more than one rating score to determine treatment status. For instance, state high school exit exam policies often condition diploma receipt on student test scores in both mathematics and English language arts (e.g., Martorell, 2005; Papay, Murnane, & Willett, 2010; Reardon, Atteberry, Arshan, & Kurlaender, 2009; Ou, 2009). Similarly, rigid cutoff scores on multiple rating scales are used for determining services for English learners in California (Robinson, 2008, under review) and higher education financial aid programs (Kane, 2003). Likewise, school accountability policies that label schools “failing” (e.g., No Child Left Behind) often base the label determination on whether multiple subgroups of students each attain their annual objectives.

1 We note that this is distinct from one rating score variable with multiple cutoff scores resulting in multiple treatment conditions (e.g., Black, Galdo, & Smith, 2005). As long as there is one rating score, regardless of the number of cutoff scores in the rating score, we refer to this class of RDs as “single rating score RDs” or single RSRDs.
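The single-rating-score estimate described above (the difference in the estimated limits of the average observed outcome on either side of the cutoff) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `sharp_rd_estimate`, the simulated data, the rectangular (uniform) kernel, the bandwidth, and the convention that scores at or above the cutoff receive the treatment.

```python
import numpy as np


def sharp_rd_estimate(rating, outcome, cutoff=0.0, bandwidth=1.0):
    """Local linear sharp-RD estimate (hypothetical helper, uniform kernel).

    Fits separate lines on each side of the cutoff within the bandwidth and
    returns the difference between the two fitted values at the cutoff.
    """
    r = np.asarray(rating, dtype=float) - cutoff   # center rating at the cutoff
    y = np.asarray(outcome, dtype=float)
    keep = np.abs(r) <= bandwidth                  # keep observations near the cutoff
    r, y = r[keep], y[keep]
    above = (r >= 0).astype(float)                 # assumed rule: at/above cutoff is treated
    # Columns: intercept, jump at cutoff, slope below, change in slope above
    X = np.column_stack([np.ones_like(r), above, r, above * r])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]                                 # estimated discontinuity at the cutoff


# Simulated example with a known jump of 2.0 at the cutoff (illustrative data only)
rng = np.random.default_rng(0)
n = 5000
rating = rng.uniform(-1, 1, n)
true_effect = 2.0
outcome = 1.0 + 0.5 * rating + true_effect * (rating >= 0) + rng.normal(0, 0.5, n)

estimate = sharp_rd_estimate(rating, outcome, cutoff=0.0, bandwidth=0.5)
print(f"estimated effect at the cutoff: {estimate:.3f}")
```

A narrower bandwidth relies less on the assumed linear form but uses fewer observations, so the estimate becomes noisier.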
In such cases, where treatment assignment is determined on the basis of two (or more) continuous rating scores, the basic logic of traditional regression discontinuity applies. Nevertheless, regression discontinuity designs using multiple rating scores (hereafter, multiple rating score RD, or MRSRD) are distinctly different from RD designs using a single rating score in that the combination of cutoff scores attained determines treatment status. As a result, designs incorporating multiple rating-score variables raise three issues not present in the single rating score case. First, multiple rating scores may determine assignment to more than two treatment conditions. Second, rather than provide estimates of a single estimand for a single population (the effect of the treatment for individuals with rating scores near the cutoff score), MRSRD may provide estimates of multiple estimands (corresponding to the multiple possible treatment contrasts and for different subpopulations). Third, the analyst is faced with a wider range of strategies for estimating treatment effects from a multiple rating score regression discontinuity. The choice among these different strategies has important implications for precision, bias, and generalizability.

Despite considerable recent work on the statistical underpinnings and practical estimation of regression discontinuity using a single rating score (see the special issue of Journal of Econometrics, 2008), the current literature lacks a thorough examination of issues concerning the study of program effects when multiple cutoff scores are used to determine eligibility or participation. In this paper, we outline these issues, describe their implications for the estimation of treatment effects, and offer suggestions for implementation.
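To make the first issue concrete, the sketch below shows how two cutoffs partition cases into four cells, using the exit-exam setting mentioned earlier in which a diploma requires passing both mathematics and English language arts. The function name `assign_condition`, the cell labels, the cutoffs at 0, and the at-or-above passing rule are all hypothetical choices for illustration, not details from the paper.

```python
import numpy as np


def assign_condition(math_score, ela_score, math_cut=0.0, ela_cut=0.0):
    """Classify cases by two rating scores (hypothetical labels and cutoffs).

    Returns the cell each case falls in and whether it meets both cutoffs.
    """
    pass_math = np.asarray(math_score, dtype=float) >= math_cut
    pass_ela = np.asarray(ela_score, dtype=float) >= ela_cut
    cell = np.where(
        pass_math & pass_ela, "pass_both",
        np.where(pass_math, "fail_ela_only",
                 np.where(pass_ela, "fail_math_only", "fail_both")),
    )
    return cell, pass_math & pass_ela


# One illustrative case per quadrant of the two-dimensional rating space
math_scores = np.array([1.0, 1.0, -1.0, -1.0])
ela_scores = np.array([1.0, -1.0, 1.0, -1.0])

cells, diploma = assign_condition(math_scores, ela_scores)
print(cells.tolist())    # the four cells created by the two cutoffs
print(diploma.tolist())  # the diploma requires meeting both cutoffs
```

Even when the policy itself is binary (diploma or no diploma), the three failing cells can differ in what they imply for the student, which is one way a two-score rule can give rise to more than one treatment contrast.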
In the next section, we provide a brief review of the single rating score RD estimator and then generalize the single RSRD design to the multiple rating score case. Here, we discuss how cutoffs in multiple rating score variables can create multiple treatment contrasts, leading to many possible estimands. The following section discusses several approaches to estimating average local treatment effects with multiple rating score variables. We discuss the assumptions and implementation concerns related to each approach. The next section addresses issues related to power in estimating the effects. Heterogeneity of treatment effects, which can be studied with MRSRD, is discussed in the following section. Our final section concludes with a comparative review of the MRSRD methods discussed in the paper, as well as a set of practical suggestions for analyzing data from MRSRD designs.

A brief review of the RD estimator

We frame our discussion in terms of the potential outcomes framework (see Fisher, 1935; Heckman, 1979; Holland, 1986; Neyman, 1923/1990; Rubin, 1978). First, consider the standard regression discontinuity design where treatment is assigned on the basis of a single rating score. Let $R$ indicate the rating variable, with the cutoff score at $R = 0$, such that cases with $R > 0$ are assigned treatment $a$ and cases with $R < 0$ are assigned treatment $b$. Each individual has two potential outcomes, one outcome (denoted $Y^a$) that will result if the individual is assigned to treatment $a$, and another ($Y^b$) that will result if he or she is assigned to treatment $b$. The expected outcome under treatment $a$ for individuals with $R = r$ is denoted $E[Y^a \mid R = r]$, and analogously for treatment $b$. Under the assumption that $E[Y^a \mid R = r]$ and $E[Y^b \mid R = r]$ are continuous functions of $r$ at $r = 0$, the average effect of treatment $a$ relative to $b$ at $R = 0$ can be written as
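For reference, the estimand described in the introduction as the difference in the limits of the average observed outcomes on either side of the discontinuity is commonly written as below. The notation is assumed here for illustration and may differ from the paper's own: $R$ is the rating score centered at the cutoff, treatment $a$ is assigned above the cutoff and $b$ below it, $Y^a$ and $Y^b$ are the potential outcomes, $Y$ is the observed outcome, and $\delta_0$ is a label chosen here for the effect at the cutoff.

```latex
% Assumed notation (not confirmed by the source):
%   R        rating score, centered so the cutoff is at R = 0
%   a, b     treatments assigned when R > 0 and R < 0, respectively
%   Y^a, Y^b potential outcomes; Y is the observed outcome
\begin{align*}
\delta_{0}
  &= E\left[\, Y^{a} - Y^{b} \mid R = 0 \,\right] \\
  &= \lim_{r \downarrow 0} E\left[\, Y^{a} \mid R = r \,\right]
   - \lim_{r \uparrow 0} E\left[\, Y^{b} \mid R = r \,\right] \\
  &= \lim_{r \downarrow 0} E\left[\, Y \mid R = r \,\right]
   - \lim_{r \uparrow 0} E\left[\, Y \mid R = r \,\right].
\end{align*}
```

The second line uses the continuity of the two expected potential-outcome functions at the cutoff. The third uses the fact that the observed outcome equals $Y^a$ above the cutoff and $Y^b$ below it.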
Similar resources
A Randomized Experiment Comparing Random to Cutoff-Based Assignment
Regression discontinuity designs (RDD) assign participants to conditions using a cutoff score, with those above the cutoff going to one condition and those below to another. Statistical theory shows that a correctly implemented and analyzed RDD gives unbiased effect estimates, just as in a randomized experiment. This study tests that theory by randomly assigning 588 participants to be in a rand...
Regression Discontinuity Designs in Epidemiology: Causal Inference Without Randomized Trials
When patients receive an intervention based on whether they score below or above some threshold value on a continuously measured random variable, the intervention will be randomly assigned for patients close to the threshold. The regression discontinuity design exploits this fact to estimate causal treatment effects. In spite of its recent proliferation in economics, the regression discontinuit...
Understanding Regression Discontinuity Designs As Observational Studies
Thistlethwaite and Campbell (1960) proposed to use a “regression-discontinuity analysis” in settings where exposure to a treatment or intervention is determined by an observable score and a fixed cutoff. The type of setting they described, now widely known as the regression discontinuity (RD) design, is one where units receive a score, and a binary treatment is assigned according to a very spec...
Regression Discontinuity Designs in Epidemiology
When patients receive an intervention based on whether they score below or above some threshold value on a continuously measured random variable, the intervention will be randomly assigned for patients close to the threshold. The regression discontinuity design exploits this fact to estimate causal treatment effects. In spite of its recent proliferation in economics, the regression discontinuit...
Publication date: 2010